Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Stretch (hello-robot) #409

Merged
merged 45 commits into from
Oct 4, 2024

Conversation

aliberts
Copy link
Collaborator

@aliberts aliberts commented Sep 4, 2024

What this does

Adds native support for Stretch 3

  • Adds StretchRobot and associated config for a standard Stretch 3
  • Adds support for rotating images from camera streams (some cameras on Stretch are mounted sideways)
  • Adds examples/8_use_stretch.md

TODO in this PR:

  • Teleop / record is currently not working when streaming Stretch's GUI through Moonlight
  • Try with tethered setup as well
  • Add eval mode
  • Add documentation

TODO in future PRs:

  • Add tests (add stretch in init.py)
  • Maybe save "real" robot's action instead of gamepad state as action?
  • Feature: Add the option to teleoperate during environment reset
  • Refactor: Move robot-specific logic of control_robot in their respective classes
  • Refactor: Naming the motors and controller's state in the recorded datasets and have them display nicely in independent columns on the visualization
  • Plumbing:
    • Add observation.images shapes in datasets metadata
    • Make sure those shapes are retrieved from there to properly size the policies' visual backbones and not from the robots config (the reason being that the width/height there can be wrong because of rotations)

How it was tested

I tested calibration, teleoperation and recoding in the 3 setups (tethered, untethered and ssh). I also tested replaying an episode with the tethered setup.

How to checkout & try? (for the reviewer)

  • Calibration (homing):
python lerobot/scripts/control_robot.py calibrate \
    --robot-path lerobot/configs/robot/stretch.yaml
  • Teleoperation:
python lerobot/scripts/control_robot.py teleoperate \
    --robot-path lerobot/configs/robot/stretch.yaml
  • Dataset recording:
python lerobot/scripts/control_robot.py record \
    --robot-path lerobot/configs/robot/stretch.yaml \
    --fps 30 \
    --root data \
    --repo-id ${HF_USER}/stretch_test \
    --tags stretch tutorial \
    --warmup-time-s 3 \
    --episode-time-s 40 \
    --reset-time-s 10 \
    --num-episodes 2 \
    --push-to-hub 0
  • Replay:
python lerobot/scripts/control_robot.py replay \
    --robot-path lerobot/configs/robot/stretch.yaml \
    --fps 30 \
    --root data \
    --repo-id ${HF_USER}/stretch_test \
    --episode 0

@aliberts aliberts added the 🌍 Real world Real-world robotics & controls label Sep 4, 2024
@aliberts aliberts self-assigned this Sep 4, 2024
@aliberts aliberts force-pushed the user/aliberts/2024_09_03_add_stretch branch 2 times, most recently from 2381002 to 4e531f0 Compare September 6, 2024 10:53
@aliberts aliberts requested a review from Cadene September 10, 2024 17:11
@aliberts aliberts marked this pull request as ready for review September 10, 2024 17:12
Copy link
Collaborator

@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass. Still reviewing

lerobot/common/datasets/compute_stats.py Outdated Show resolved Hide resolved
lerobot/common/robot_devices/utils.py Outdated Show resolved Hide resolved
Comment on lines 31 to 48
def find_camera_indices(raise_when_empty=True) -> list[int]:
def find_cameras_info(raise_when_empty=True) -> dict[int, str]:
"""
Find the serial numbers of the Intel RealSense cameras
Find the names and the serial numbers of the Intel RealSense cameras
connected to the computer.
"""
camera_ids = []
cameras_info = {}
for device in rs.context().query_devices():
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX)))
camera_ids.append(serial_number)
name = device.get_info(rs.camera_info.name)
cameras_info[serial_number] = name

if raise_when_empty and len(camera_ids) == 0:
if raise_when_empty and len(cameras_info) == 0:
raise OSError(
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware."
)

return camera_ids
return cameras_info
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not fan of modifying this function, to stay consistent with the one in opencv.py
Instead, could we have another function that given camera_indices / serial number retrieves the camera name?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is that we would still need to iterate through rs.context().query_devices() in that function and compare device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX)) against the given serial_number before accessing the name. We can do it but:

    1. Would makes things harder to read and understand why it's doing those convolutions
    1. Would be slower as you'd iterate twice on the devices

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What can we do to unify opencv and intelrealsense on this function?

I would imagine find_cameras_info(raise_when_empty=True) -> list[dict]: for both, with camera_infos[0]["index"] being the camera index port for opencv or intelrealsense serial, and camera_infos[0]["name"] not existing for opencv.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: camera_infos instead of cameras_info for consistency with camera_ids and how we usually add a s at the end for our variable names of type list

Copy link
Collaborator Author

@aliberts aliberts Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What can we do to unify opencv and intelrealsense on this function?

As we discussed, I think we should eventually converge to a single Camera class with different backends (realsense, opencv...). This would probably simplify the design and avoid having to monkeypatch those classes in the tests. Then we could think about the design of this function (which would be a single one). Wdyt?

Nit: camera_infos instead of cameras_info for consistency with camera_ids and how we usually add a s at the end for our variable names of type list

I wondered about the exact same thing but the plural of "info" or "information" in English is without a s.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I agree, but I think we should homogenise opencv and intelrealsense at this level before merging. What do you think of my suggestion?

  2. I know but we dont write in english, we write in python (which is an english dialect) so the english grammar rules dont apply the same way ^^

@aliberts aliberts force-pushed the user/aliberts/2024_09_03_add_stretch branch from 50aa25c to dee3b0a Compare September 12, 2024 09:05
@aliberts aliberts force-pushed the user/aliberts/2024_09_03_add_stretch branch from dee3b0a to 88526d0 Compare September 12, 2024 12:34
@aliberts aliberts requested a review from Cadene September 25, 2024 14:15
Copy link

@hello-binit hello-binit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @aliberts, looks great! I've only done a read-through so far. I can run it on actual hardware later this week. I don't want to delay your PR, so don't feel obliged to wait.

examples/8_use_stretch.md Show resolved Hide resolved
```
This is essentially the same as running `stretch_gamepad_teleop.py`

Store your Hugging Face repository name in a variable to run these commands:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a huggingface-cli login step?

Copy link
Collaborator Author

@aliberts aliberts Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes you're correct. Technically, it's already pointed out in the 7_get_started_with_real_robot.md as some other details and I didn't want to add to avoid too much redundancy. I've added a word on that (7c87025)
Eventually we might rearrange these markdown as the number of robots grow (cc @Cadene)

if camera_ids is None:
camera_ids = find_camera_indices()
if len(camera_ids) == 0:
camera_ids = find_cameras_info()

print("Connecting cameras")
cameras = []
for cam_idx in camera_ids:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, the variable should be renamed to cameras_info

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, this was because it was copied from a very similar function for opencv cameras (see this discussion).
I'll wait for that discussion to resolve before changing it.

@@ -237,6 +248,36 @@ def __init__(
self.depth_map = None
self.logs = {}

# TODO(alibets): Do we keep original width/height or do we define them after rotation?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely nice to keep width/height consistent with the rotated image

self.is_connected = False
self.teleop = None
self.logs = {}
# TODO(aliberts): remove original low-level logging from stretch

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can add

https://github.com/hello-robot/stretch_body/blob/5ef6564dcdd93fb8120e44ad4d34017617210142/tools/bin/stretch_configure_tool.py#L2-L4

at the top of the file to raise the logging level. I might be able to also add a boolean to outright disable Stretch Body logging if it still interferes with your loggers. Let me know.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tried that in the past and it didn't work, but somehow since then the logs from Stretch have disappeared and I'm not exactly sure why 😅
I've added it e0fa5e4

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's interesting 😂. I'll let you know if I see them when I run it.

Comment on lines +98 to +121
if self.state_keys is None:
self.state_keys = list(state)

if not record_data:
return

state = torch.as_tensor(list(state.values()))
action = torch.as_tensor(list(action.values()))

# Capture images from cameras
images = {}
for name in self.cameras:
before_camread_t = time.perf_counter()
images[name] = self.cameras[name].async_read()
images[name] = torch.from_numpy(images[name])
self.logs[f"read_camera_{name}_dt_s"] = self.cameras[name].logs["delta_timestamp_s"]
self.logs[f"async_read_camera_{name}_dt_s"] = time.perf_counter() - before_camread_t

# Populate output dictionnaries
obs_dict, action_dict = {}, {}
obs_dict["observation.state"] = state
action_dict["action"] = action
for name in self.cameras:
obs_dict[f"observation.images.{name}"] = images[name]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lot of duplicate code. Could call self.get_observation() instead.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cadene I know we can't do that for send_action but after looking at it, I don't see why we can't simply call self.get_observation() in teleop_step. I've seen your comment in manipulator.py that get_observation doesn't return a batch dim but so does teleop_step.
Wdyt, am I missing something here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean self.capture_observation() ?

I am fine with this duplicated code for now. We have the exact same pattern in manipulator.py. We will refactor at some point.

We can add a TODO(rcadene, aliberts): refactor with self.capture_observation()


def print_logs(self) -> None:
...
# TODO(aliberts): move robot-specific logs logic here

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for log in self.logs:
    self.logger.info(f"{log}: {self.logs[log]}")

i think

lerobot/common/robot_devices/utils.py Outdated Show resolved Hide resolved
robot_type: stretch3

cameras:
# TODO(aliberts): hardware currently not working, uncomment and test this when fixed

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did this get resolved? This is specific to your robot, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacement board is on the way ;)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad to hear it!

examples/8_use_stretch.md Show resolved Hide resolved
Copy link
Collaborator

@Cadene Cadene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, almost ready to merge

Comment on lines 684 to 686
def print_logs(self):
...
# TODO(aliberts): move robot-specific logs logic here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use pass instead of ...?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done eb48452

Comment on lines 192 to 194
def print_logs(self) -> None:
...
# TODO(aliberts): move robot-specific logs logic here
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same pass?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done eb48452

lerobot/common/robot_devices/robots/utils.py Show resolved Hide resolved
pyproject.toml Outdated Show resolved Hide resolved
.github/workflows/test.yml Show resolved Hide resolved
Comment on lines 31 to 48
def find_camera_indices(raise_when_empty=True) -> list[int]:
def find_cameras_info(raise_when_empty=True) -> dict[int, str]:
"""
Find the serial numbers of the Intel RealSense cameras
Find the names and the serial numbers of the Intel RealSense cameras
connected to the computer.
"""
camera_ids = []
cameras_info = {}
for device in rs.context().query_devices():
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX)))
camera_ids.append(serial_number)
name = device.get_info(rs.camera_info.name)
cameras_info[serial_number] = name

if raise_when_empty and len(camera_ids) == 0:
if raise_when_empty and len(cameras_info) == 0:
raise OSError(
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware."
)

return camera_ids
return cameras_info
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What can we do to unify opencv and intelrealsense on this function?

I would imagine find_cameras_info(raise_when_empty=True) -> list[dict]: for both, with camera_infos[0]["index"] being the camera index port for opencv or intelrealsense serial, and camera_infos[0]["name"] not existing for opencv.

Comment on lines 31 to 48
def find_camera_indices(raise_when_empty=True) -> list[int]:
def find_cameras_info(raise_when_empty=True) -> dict[int, str]:
"""
Find the serial numbers of the Intel RealSense cameras
Find the names and the serial numbers of the Intel RealSense cameras
connected to the computer.
"""
camera_ids = []
cameras_info = {}
for device in rs.context().query_devices():
serial_number = int(device.get_info(rs.camera_info(SERIAL_NUMBER_INDEX)))
camera_ids.append(serial_number)
name = device.get_info(rs.camera_info.name)
cameras_info[serial_number] = name

if raise_when_empty and len(camera_ids) == 0:
if raise_when_empty and len(cameras_info) == 0:
raise OSError(
"Not a single camera was detected. Try re-plugging, or re-installing `librealsense` and its python wrapper `pyrealsense2`, or updating the firmware."
)

return camera_ids
return cameras_info
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: camera_infos instead of cameras_info for consistency with camera_ids and how we usually add a s at the end for our variable names of type list

@Cadene Cadene merged commit 1a343c3 into main Oct 4, 2024
7 checks passed
@Cadene Cadene deleted the user/aliberts/2024_09_03_add_stretch branch October 4, 2024 16:56
amandip7 pushed a commit to amandip7/lerobot that referenced this pull request Oct 10, 2024
Co-authored-by: Remi <remi.cadene@huggingface.co>
Co-authored-by: Remi Cadene <re.cadene@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🌍 Real world Real-world robotics & controls
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants